Add Rails comparison: Action Cable / Solid Cable / AsyncCable / AnyCable #2
Open
irinanazarova wants to merge 18 commits into
Target apps and harness for the Rails WebSocket adapter comparison behind
anycable.io/compare/rails-actioncable.
Targets:
- cable-bench/ Rails 8.1 app, BENCH_MODE selects Action Cable (Redis)
or Solid Cable (database); also the AnyCable RPC backend
- cable-bench-falcon/ same app booted on Falcon via actioncable-next +
async-cable (the AsyncCable target)
Harness:
- idle-multi.ts: forward CHANNEL/AC_PROTOCOL and send the bench-runner auth
token, so idle/capacity runs can target a real Rails channel
- jitter-multi.ts: forward CHANNEL + AC_PROTOCOL for Rails targets
- idle-runner.ts / jitter-runners.ts / server.ts: channel + acProtocol params
- tests-manifest.ts: Rails latency/jitter/idle/avalanche/capacity specs
Results (sharded, one shared-tenant Railway window):
- backend/results/rails-sharded-2026-06-28.json latency/jitter/10K/idle/avalanche
- backend/results/rails-capacity-break-2026-06-28.json idle-to-break per adapter
Deep dive in docs/rails-comparison.md; summary in README.
Encode each stream payload once per channel identifier instead of once per subscriber (~2x faster broadcasts). Stock Action Cable has no equivalent, so this measures the optimized actioncable-next path on the Async::Cable/Falcon target.
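The idea of encoding once per channel identifier can be sketched as below. This is only an illustration in TypeScript, not the actioncable-next implementation: the `Subscriber` type, `broadcastEncodeOnce`, and the default `encode` function are hypothetical names, and the frame shape assumes the usual Action Cable `{ identifier, message }` envelope.

```typescript
// Hypothetical types: not the actioncable-next API, only the caching idea.
type Subscriber = { identifier: string; send: (frame: string) => void };

// Sends `message` to every subscriber, encoding at most once per identifier.
// Returns the number of encode calls that were actually performed.
function broadcastEncodeOnce(
  subscribers: Subscriber[],
  message: unknown,
  encode: (identifier: string, message: unknown) => string = (identifier, msg) =>
    JSON.stringify({ identifier, message: msg }),
): number {
  const frames = new Map<string, string>(); // identifier -> encoded frame
  let encodes = 0;
  for (const sub of subscribers) {
    let frame = frames.get(sub.identifier);
    if (frame === undefined) {
      frame = encode(sub.identifier, message);
      frames.set(sub.identifier, frame);
      encodes += 1;
    }
    sub.send(frame);
  }
  return encodes;
}
```

With N subscribers sharing one identifier, the per-subscriber path would encode N times, while this sketch encodes once and reuses the frame.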
actioncable-next fastlane sends pre-encoded frames via Socket#raw_transmit, which async-cable 0.3.1's Socket does not implement; without this shim every fastlane broadcast raises NoMethodError and delivery is 0%.
throughput.ts / bench-runner /bench-throughput-anycable / throughput-multi.ts now accept channel + acProtocol, mirroring the jitter path, so the throughput suite can target a Rails BenchmarkChannel over the base protocol instead of only anycable-go $pubsub over the extended protocol.
The runner reads req.query.intervalMs; throughput-multi sent 'interval', so every run silently used the 100ms default and ignored the requested rate. Coordinator-only fix; no runner rebuild needed.
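A minimal sketch of why the mismatch was silent, assuming the runner falls back to its default whenever `intervalMs` is absent. `readIntervalMs` and `DEFAULT_INTERVAL_MS` are hypothetical names used for illustration; only the `intervalMs` key and the 100ms default come from the commit message.

```typescript
const DEFAULT_INTERVAL_MS = 100; // default stated in the commit message

// Hypothetical parser: reads only the `intervalMs` key, as the runner is described to do.
function readIntervalMs(query: Record<string, string | undefined>): number {
  const parsed = Number(query.intervalMs);
  // A missing or invalid value falls back to the default without raising an error.
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_INTERVAL_MS;
}

const sentByOldCoordinator = readIntervalMs({ interval: "10" });   // key ignored, default used
const sentByFixedCoordinator = readIntervalMs({ intervalMs: "10" }); // requested rate honored
```

Because the unknown `interval` key is simply ignored, nothing fails; the run just proceeds at the default rate.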
Pin async-cable to @27181dff1 (native Socket#raw_transmit, Rails 8.1 compatible) and drop the raw_transmit shim. Vendor Async::Cable::Executor (from async-cable dddef54c, whose released form requires edge Rails 8.2) and install it via ActionCable::Server::Base#executor, so broadcast-delivery callbacks (SubscriberMap::Async#invoke_callback -> executor.post) run on the reactor instead of bouncing through Action Cable's thread pool. This is the documented fix for Falcon broadcast latency; re-measure vs the 0.3.1 numbers.
Stock Rails puma.rb omits the workers directive, so WEB_CONCURRENCY was ignored and the Action Cable target ran a single Puma process regardless of the env var. Set it explicitly so Puma's process count matches the Falcon target's falcon --count for a matched WS-engine comparison.
…/cableUrl) Was hardcoded to /bench-avalanche-socketio with no Rails param passthrough. Now selects /bench-avalanche-<protocol> (defaults to anycable for rails-* services) and forwards channel/acProtocol/cableUrl, so the deploy-survival test can drive Action Cable / Solid Cable / Async::Cable / AnyCable.
Adapts avalanche-multi-uws.ts to the /bench-avalanche-anycable endpoint so the post-redeploy reconnect storm is generated across many bench-runners (~250 clients each) instead of one Node process, removing the load-generator limit on the deploy-survival test. Fires one serviceInstanceRedeploy, aggregates time-to-95%-reconnect across shards. prearm/recovery tuned under Railway's 5-min proxy timeout.
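One way to aggregate time-to-95%-reconnect across shards is sketched below. The `ShardReport` shape and `timeToFractionReconnected` are hypothetical and do not reflect the harness's actual result format; the sketch assumes each shard reports how many clients it held and the time each one took to reconnect.

```typescript
// Hypothetical shard report: not the harness's actual result shape.
interface ShardReport {
  clients: number;       // clients this shard held before the redeploy
  reconnectMs: number[]; // ms from redeploy to each successful reconnect
}

// Returns the time by which `fraction` of ALL clients (across shards) had
// reconnected, or null if that fraction was never reached.
function timeToFractionReconnected(shards: ShardReport[], fraction = 0.95): number | null {
  const total = shards.reduce((sum, s) => sum + s.clients, 0);
  const times = shards
    .reduce<number[]>((all, s) => all.concat(s.reconnectMs), [])
    .sort((a, b) => a - b);
  const needed = Math.ceil(total * fraction);
  if (needed === 0 || times.length < needed) return null;
  return times[needed - 1] ?? null;
}
```

Counting against the total client population, rather than only the clients that came back, keeps a shard that lost most of its clients from making the fleet look faster than it was.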
The @anycable/core client's default reconnect backoff is multi-second, so the resume-tail p99 after a transient drop is dominated by reconnect wait, not server delivery. Add reconnectBaseMs (job param + RECONNECT_BASE_MS env): first reconnect fires in ~base ms (then x2 up to 5s). Set ~200 to collapse the tail.
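The schedule described here (first retry at about the base, then doubling up to 5s) can be sketched as follows. `reconnectDelayMs` is a hypothetical helper, not the client's `backoffWithJitter`, and it deliberately leaves out any jitter the real strategy may apply.

```typescript
const MAX_DELAY_MS = 5000; // "up to 5s" per the commit message

// Hypothetical helper: deterministic doubling from `baseMs`, capped at 5s.
// `attempt` is zero-based, so attempt 0 is the first reconnect.
function reconnectDelayMs(baseMs: number, attempt: number): number {
  return Math.min(baseMs * 2 ** attempt, MAX_DELAY_MS);
}

// Schedule for a 200ms base: 200, 400, 800, 1600, 3200, 5000
const schedule = [0, 1, 2, 3, 4, 5].map((attempt) => reconnectDelayMs(200, attempt));
```

With a base around 200ms the first retry lands well inside a 2s outage window, which is why the tail collapses.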
The AnyCable jitter loop force-closed the socket then waited jitterDurationMs, during which the client's Monitor reconnected on its backoff -> the offline period was the backoff delay, not a fixed outage, so delivery depended on the client's reconnect config. Now re-terminate any reconnect until the window elapses, so it measures a standard 2s drop; the client backoff only governs recovery speed after the outage.
Re-terminating fought the Monitor (flapping reconnects + backoff escalation). Instead cable.disconnect() emits close -> Monitor cancels reconnect, client stays cleanly offline for the outage window (sid retained), then connect() reconnects once and AnyCable resumes. Outage length is now fixed and backoff-independent, a true standard network drop.
Add @rails/actioncable and a clientLib=actioncable path in the jitter runner so Action Cable / Solid Cable / Async::Cable are driven by the official Rails client (base protocol, its own reconnect monitor, no resume), while AnyCable keeps @anycable/core (extended protocol, resume). Realistic per-server client instead of using @anycable/core for everything.
Its ConnectionMonitor calls addEventListener/removeEventListener and reads document.visibilityState, which don't exist in Node -> ReferenceError. Provide no-op stubs before the client loads.
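A sketch of such stubs, assuming they only need to exist and never fire. `installBrowserStubs` is a hypothetical name, and reporting `visibilityState` as `"visible"` is an assumption made for illustration, not necessarily what the PR's shim does.

```typescript
// Install no-op browser globals only where they are missing (e.g. in Node).
function installBrowserStubs(): void {
  const g = globalThis as unknown as Record<string, unknown>;
  if (typeof g.addEventListener !== "function") g.addEventListener = () => {};
  if (typeof g.removeEventListener !== "function") g.removeEventListener = () => {};
  // Assumption: the page is always reported as visible.
  if (g.document === undefined) g.document = { visibilityState: "visible" };
}

// Must run before the client library is loaded.
installBrowserStubs();
```

Guarding each assignment keeps the shim harmless if it is ever loaded in an environment that already provides these globals.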
… Node shim Jitter: Action Cable family now drops the socket uncleanly and recovers on the official client's own poll-based monitor (native, seconds) rather than a forced immediate reconnect. Avalanche: same clientLib branch so deploy-survival uses each server's real client. Extract the Node WebSocket+globals shim to a shared module.
Strip the half-committed Centrifugo work that leaked into this Rails PR: server.ts imported ../lib/centrifugo-runners.js (never committed), so the bench-runner failed to build. Remove the centrifugo endpoints, env config, and centrifugoUrls helper from server.ts, the centrifugo protocol wiring from jitter-multi/throughput-multi, and the dev:centrifugo/smoke:centrifugo scripts from package.json. Keep the curated Rails result files in the repo (README/docs cite them): un-ignore backend/results/rails-*.json and socketioxide-*.json, and add rails-capacity-break-2026-06-28.json (previously referenced but untracked). Raw per-run dumps stay ignored.
- jitter-runners: add destroy() to the JitterConn surface and use it at teardown. The @rails/actioncable path's disconnect() only closes the socket (leaving the ConnectionMonitor polling), which is correct for the in-run outage but orphaned a reconnecting consumer in the long-lived bench-runner at end of run. destroy() calls consumer.disconnect(), which also stops the monitor.
- jitter-runners: document the outage asymmetry explicitly: @anycable/core gets a fixed jitterDurationMs offline window; @rails/actioncable stays down for jitterDurationMs + its native monitor's reconnect latency, so its delivery reflects both no-resume and real client recovery time.
- avalanche-anycable-runner: track per-connection up/down state so a single drop that fires both "disconnect" and "close" (or any repeat event) counts once, instead of double-incrementing disconnected.
- README: results/ note now reflects tracked published files vs ignored dumps.
- async_cable_executor: note the dependency on ActionCable::Server::Base's @mutex/@executor ivars.
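The per-connection up/down tracking in the avalanche-anycable-runner item can be sketched as below. `ConnectionStateCounter` is a hypothetical class, not the runner's actual code; it only shows how counting on the up-to-down transition avoids double-incrementing.

```typescript
// Hypothetical tracker: one drop may emit both "disconnect" and "close".
class ConnectionStateCounter {
  private up = new Set<number>(); // ids of connections currently up
  disconnected = 0;               // number of distinct drops observed

  markUp(id: number): void {
    this.up.add(id);
  }

  // Counts a drop only on the up -> down transition; repeat events are ignored.
  markDown(id: number): void {
    if (this.up.delete(id)) this.disconnected += 1;
  }
}

const counter = new ConnectionStateCounter();
counter.markUp(1);
counter.markDown(1); // "disconnect" event
counter.markDown(1); // "close" event for the same drop, ignored
```

A later reconnect calls `markUp` again, so a genuine second drop of the same connection is still counted.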
palkan
reviewed
Jul 2, 2026
logLevel: "error" as never,
...(urls.reconnectBaseMs && urls.reconnectBaseMs > 0
  ? {
      reconnectStrategy: backoffWithJitter(urls.reconnectBaseMs, {
Member
I think we'd better provide a custom reconnectStrategy function matching Action Cable's default: https://github.com/rails/rails/blob/bf13f50eb663a1ad0a6e996634c9e298149f088d/actioncable/app/javascript/action_cable/connection_monitor.js#L74-L80
It would be something like:
reconnectStrategy: function(attempts) {
  const staleThreshold = 6
  const reconnectionBackoffRate = 0.15
  const backoff = Math.pow(1 + reconnectionBackoffRate, Math.min(attempts, 10))
  const jitterMax = attempts === 0 ? 1.0 : reconnectionBackoffRate
  const jitter = jitterMax * Math.random()
  return staleThreshold * 1000 * backoff * (1 + jitter)
}
Or we can experiment with another custom function (but without trying to tweak our backoffWithJitter; it's getting too cryptic).
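To see what delays the suggested function produces, here is a typed variant with an injectable random source. `actionCableDelayMs` and its `random` parameter are hypothetical additions for illustration; the constants and formula are the ones quoted in the comment.

```typescript
const STALE_THRESHOLD_S = 6; // seconds, as in the comment
const BACKOFF_RATE = 0.15;   // reconnectionBackoffRate, as in the comment

// Same formula as the suggested reconnectStrategy, with `random` injectable
// so the lower and upper bounds can be inspected deterministically.
function actionCableDelayMs(attempts: number, random: () => number = Math.random): number {
  const backoff = Math.pow(1 + BACKOFF_RATE, Math.min(attempts, 10));
  const jitterMax = attempts === 0 ? 1.0 : BACKOFF_RATE;
  const jitter = jitterMax * random();
  return STALE_THRESHOLD_S * 1000 * backoff * (1 + jitter);
}

const firstAttemptMin = actionCableDelayMs(0, () => 0);   // lower bound of the first delay
const firstAttemptMid = actionCableDelayMs(0, () => 0.5); // midpoint of the first delay
```

Under this formula the first delay falls between 6s and 12s, and later attempts grow by about 15% each until the exponent is capped at 10, which is much slower than a ~200ms base backoff.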
Adds a fourth comparison to the bench: Rails WebSocket adapters (Action Cable, Solid Cable, Async::Cable, AnyCable), all speaking the Action Cable API, measured on the same sharded fleet as the Node.js suite.
Targets
- cable-bench/ (Puma): one image, three modes via BENCH_MODE (actioncable = Redis adapter, solidcable = DB polling, anycable = gRPC RPC backend + anycable-go gateway).
- cable-bench-falcon/ (Falcon): Async::Cable via async-cable + actioncable-next.
Adapter fixes to measure each fairly
- actioncable-next fastlane broadcasts call Socket#raw_transmit, which async-cable 0.3.1 lacks (0% delivery). Pin async-cable to the commit that adds it, and vendor the fiber-based Async::Cable::Executor (its released form requires edge Rails 8.2) so broadcast dispatch stays on the reactor instead of thread-hopping.
- puma.rb omits the workers directive, so WEB_CONCURRENCY was ignored and Puma ran a single worker. Honor it, so process count matches Falcon's --count for a matched comparison.
Harness: Rails support in the coordinators
- jitter-multi / throughput-multi / avalanche-multi: forward channel + acProtocol + cableUrl so the drivers hit a real Rails BenchmarkChannel over the base or extended protocol (not only anycable-go $pubsub).
- avalanche-multi-anycable: sharded deploy-survival (many runners generate the reconnect storm, one serviceInstanceRedeploy, aggregate time-to-95%), so the recovery number is not load-generator-limited.
- throughput-multi publish-rate key (interval -> intervalMs).
Realistic client per server
- @rails/actioncable (the official Rails client, base protocol, native poll-based reconnect, no resume) for Action Cable / Solid Cable / Async::Cable; @anycable/core (extended protocol, resume) for AnyCable. Driving Action Cable with AnyCable's client flatters its reconnect; using each server's real client is the honest comparison. Includes a small Node shim (WebSocket adapter + browser-global stubs).
- … reconnectBaseMs) on the @anycable/core client.
- … disconnect()/connect() for @anycable/core; native socket drop + native monitor recovery for @rails/actioncable), so delivery reflects a real ~2s drop rather than the client's backoff.
Headline findings (matched, sharded)
Raw data: backend/results/rails-sharded-2026-06-28.json and backend/results/rails-capacity-break-2026-06-28.json. Full write-up in docs/rails-comparison.md.
- … anycable-go gateway holds connections across a Rails RPC-backend redeploy), the in-process trio ~7.5–8s down to ~96% reconnect.